Introduction

This document records the construction of the Index.

Conceptual Framework

Based on the indicators available in the country, the conceptual framework can be represented in the diagram below:

In detail, the structure of the composite indicator is as follows:

#> ----------
#> Your data:
#> ----------
#> Input:
#>   Units: 340 (GT0101, GT0102, GT0103, ...)
#>   Indicators: 41 (InsegAlim, Dormitoria, VivCollectCalle, ...)
#> 
#> Structure:
#>   Level 1 Indicator: 41 indicators (DesastresNat, Deportadas, Refugiados, ...) 
#>   Level 2 Category: 12 groups (Desastres, Desplaz, Violencia, ...) 
#>   Level 3 Dimension: 3 groups (Amenazas, Cap_Resp, Sit_SocEc) 
#>   Level 4 Index: 1 groups (MVI)

Data treatment

For convenience, the main steps (data treatment, normalisation and aggregation) have been condensed into a single function that covers the following:

  • Outlier treatment : Outlier treatment aims to adjust the distributions of highly skewed or fat-tailed indicators, including cases where there are outliers that are not characteristic of the rest of the distribution. This is done to improve the discriminatory power of the indicator in aggregation. For more on this, see here.

  • Skew and kurtosis : If the absolute skew is greater than 2 AND the kurtosis is greater than 3.5, data treatment is applied (Winsorisation, described in the next point). Otherwise the indicator is left as it is and the procedure moves on to the next indicator.

  • Winsorisation : Winsorise up to a maximum of five points. After each Winsorised point, check whether skew and kurtosis fall back within the limits specified. If so, apply no further data treatment and move to the next indicator. If the maximum number of Winsorised points is reached and skew and kurtosis are still outside the thresholds, “undo” any Winsorised points and apply a log transformation instead.

A number of indicators might require data treatment. To deal with this, we follow the standard procedure described above, which is applied by default.
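The treatment procedure described above can be sketched in a few lines of standalone Python. This is only an illustration under stated assumptions, not the COINr implementation: the function names (`skewness`, `kurtosis`, `needs_treatment`, `winsorise_once`, `treat`) are hypothetical, skew and kurtosis are computed in their population form with kurtosis taken as excess kurtosis, the log transformation uses a shift so that the smallest value maps to zero, and the indicator is assumed not to be constant.

```python
import math
from statistics import mean


def skewness(xs):
    """Third standardised moment (population form; an assumption)."""
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5


def kurtosis(xs):
    """Excess kurtosis (population form; an assumption)."""
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m4 = sum((x - m) ** 4 for x in xs) / len(xs)
    return m4 / m2 ** 2 - 3.0


def needs_treatment(xs, skew_max=2.0, kurt_max=3.5):
    """True only if BOTH thresholds from the text are exceeded."""
    return abs(skewness(xs)) > skew_max and kurtosis(xs) > kurt_max


def winsorise_once(xs):
    """Pull the most extreme value(s) on the skewed tail to the next distinct value."""
    high_tail = skewness(xs) > 0
    extreme = max(xs) if high_tail else min(xs)
    rest = [x for x in xs if x != extreme]
    if not rest:
        return list(xs)
    neighbour = max(rest) if high_tail else min(rest)
    return [neighbour if x == extreme else x for x in xs]


def treat(xs, max_winsor=5):
    """Return (treated values, label of the treatment that was applied)."""
    if not needs_treatment(xs):
        return list(xs), "none"
    ys = list(xs)
    for k in range(1, max_winsor + 1):
        ys = winsorise_once(ys)
        if not needs_treatment(ys):
            return ys, f"winsorised {k}"
    # Winsorisation was not enough: discard it and log-transform the original data.
    shift = 1 - min(xs)  # assumed shift so that the smallest value maps to log(1) = 0
    return [math.log(x + shift) for x in xs], "log"


# Illustrative data: twenty well-behaved values plus one extreme outlier.
values = list(range(1, 21)) + [1000]
treated, method = treat(values)
print(method, max(treated))
```

In this made-up example a single Winsorised point is enough to bring the indicator back within the thresholds, so no log transformation is applied.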

Building the index

We can see that most indicators have been dealt with by applying a log transformation as expected, whereas a few have been Winsorised. In total, after treatment four indicators still fall outside the skew/kurtosis limits. We will check these visually:

This shows a problem: one of the indicators is unusually negatively skewed. In this case, applying a log transformation won’t work, because a log transformation corrects for positive skew. To deal with this, I have encoded a function in COINr which can deal with negative skew as well, and this is invoked here. It checks the direction of the skew and applies the corresponding transformation.
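The idea of a direction-aware log transformation can be illustrated with a short standalone Python sketch. It is an assumption about how such a transformation might look, not the function encoded in COINr: the name `log_transform_by_skew` is hypothetical, and for negative skew the values are reflected around the maximum, log-transformed, and negated so that the original ordering is kept.

```python
import math
from statistics import mean


def skewness(xs):
    """Third standardised moment (population form; an assumption)."""
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5


def log_transform_by_skew(xs):
    """Log-transform that compresses whichever tail is the long one."""
    lo, hi = min(xs), max(xs)
    if skewness(xs) >= 0:
        # Positive skew: ordinary shifted log compresses the long right tail.
        return [math.log(x - lo + 1) for x in xs]
    # Negative skew: reflect around the maximum, take logs, then negate
    # so that larger original values still give larger transformed values.
    return [-math.log(hi - x + 1) for x in xs]


# Illustrative, made-up values with a long left tail (negative skew).
raw = [1, 90, 95, 97, 98, 99, 100]
transformed = log_transform_by_skew(raw)
print(round(skewness(raw), 2), round(skewness(transformed), 2))
```

On these made-up values the absolute skew is reduced while the ranking of the units is unchanged, which is the property the transformation needs before aggregation.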

Now let’s check the outcome. We just focus on “AccElectr” here which is the problematic indicator:

This demonstrates the effectiveness of the new transformation: it has brought the distribution of the indicator closer to normal while retaining its ordering. The scale of the indicator is now different (as with all transformations), but this is not important, since the indicators will in any case be scaled between 0 and 100 in the following step, and the scaling and transformation are only for the purposes of aggregation. When presenting individual indicators, we will of course present the real data.

  • Normalise : Following this, we can normalise the indicators using the default min-max approach. This scales each indicator onto the \([0,100]\) interval.

  • Aggregate : Now we create the aggregate levels by aggregating up to the index. Several aggregation options are used, and their results are generated for further comparison by field experts.
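The normalisation and aggregation steps can be sketched as follows in standalone Python. This is a minimal illustration, not the COINr code: the raw values are invented, the assignment of indicators to categories is hypothetical (the indicator and category names are borrowed from the structure printed earlier only for readability), and equal weights are assumed at every level.

```python
def min_max(values, lower=0.0, upper=100.0):
    """Rescale values linearly onto [lower, upper]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant indicator cannot be min-max scaled")
    return [lower + (v - lo) * (upper - lower) / (hi - lo) for v in values]


def weighted_mean(scores, weights=None):
    """Weighted arithmetic mean; equal weights if none are given."""
    weights = weights or [1.0] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Made-up raw values for three units on three indicators.
raw = {
    "DesastresNat": [2.0, 4.0, 6.0],
    "Deportadas": [10.0, 30.0, 20.0],
    "Refugiados": [0.0, 5.0, 10.0],
}
# Hypothetical grouping of indicators into categories (an assumption).
structure = {
    "Desastres": ["DesastresNat"],
    "Desplaz": ["Deportadas", "Refugiados"],
}
n_units = 3

# Step 1: normalise every indicator onto [0, 100].
normalised = {name: min_max(vals) for name, vals in raw.items()}

# Step 2: aggregate indicators into category scores, unit by unit.
category_scores = {
    cat: [weighted_mean([normalised[i][u] for i in members]) for u in range(n_units)]
    for cat, members in structure.items()
}

# Step 3: aggregate category scores one level further up.
index = [
    weighted_mean([category_scores[c][u] for c in structure]) for u in range(n_units)
]
print(index)
```

The same two functions are applied level by level, so a deeper hierarchy (indicator, category, dimension, index) only repeats step 2.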

Results

Our first view of the results is a results table. By default, the table is sorted from the highest scoring (most vulnerable) municipalities downwards, based on the Index scores for each scenario.

Scenario 1: Arithmetic mean

These results should be checked to see whether they agree with common sense. Another way of looking at the results is in a bar chart. Here, since we have a lot of municipalities I will just plot the top thirty. They are coloured by departamento.

We can plot the same chart but broken down by Dimension scores - this can give a view of how much each dimension contributes to the total score.

As a last view of the results (for the moment), we can plot a choropleth map. This is based on the municipal shape files.

Scenario 2: Geometric mean

One issue to address when aggregating indicators is compensability: to what extent can we accept that a high score on one indicator compensates for a low score on another?

Geometric aggregation avoids the assumption of full compensability: a low score on one indicator is only partially offset by high scores on the others.
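A small numerical illustration of compensability, written as a standalone Python sketch with made-up scores. It assumes strictly positive scores, since the geometric mean is undefined or degenerate when a score is zero; in practice this usually means rescaling onto an interval such as [1, 100] before aggregating.

```python
import math


def arithmetic_mean(scores):
    """Fully compensatory: only the sum of the scores matters."""
    return sum(scores) / len(scores)


def geometric_mean(scores):
    """Partially compensatory: low scores pull the result down more strongly."""
    if any(s <= 0 for s in scores):
        raise ValueError("geometric mean needs strictly positive scores")
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Two made-up units with the same total score but different balance.
balanced = [50.0, 50.0]
unbalanced = [10.0, 90.0]

for name, scores in [("balanced", balanced), ("unbalanced", unbalanced)]:
    print(name, arithmetic_mean(scores), round(geometric_mean(scores), 2))
```

Both units have the same arithmetic mean, but the geometric mean of the unbalanced unit is clearly lower, because the high score cannot fully make up for the low one.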

These results should be checked to see whether they agree with common sense. Another way of looking at the results is in a bar chart. Here, since we have a lot of municipalities I will just plot the top thirty. They are coloured by departamento.

We can plot the same chart but broken down by Dimension scores - this can give a view of how much each dimension contributes to the total score.

As a last view of the results (for the moment), we can plot a choropleth map. This is based on the municipal shape files.

Scenario 3: Benefit of the Doubt

This method is the application of Data Envelopment Analysis (DEA) to the field of composite indicators. It was originally proposed by Melyn and Moesen (1991) to evaluate macroeconomic performance. ACAPS has prepared an excellent note on The use of data envelopment analysis to calculate priority scores in needs assessments.

The BoD approach offers several advantages:

  • Weights are determined endogenously by the observed performances, and the benchmark is not based on theoretical bounds but is a linear combination of the best observed performances.

  • The principle is easy to communicate: since we are not sure about the right weights, we look for “benefit of the doubt” weights, chosen for each unit so that its overall relative performance index is as high as possible.
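To make the “benefit of the doubt” principle concrete, here is a standalone Python sketch restricted to two indicators. It is an illustration under assumptions, not the routine used to produce the results: the function name `bod_score` and the data are hypothetical, indicator values are assumed to be strictly positive, and the small linear programme (maximise the unit's weighted score, subject to no unit scoring above 1 with the same non-negative weights) is solved by enumerating the vertices of the feasible region. With more than two indicators a proper linear programming solver would be needed.

```python
from itertools import combinations


def bod_score(unit, units, eps=1e-9):
    """Best weighted score of `unit` = (x1, x2) under BoD constraints.

    Maximise w1*x1 + w2*x2 subject to w1*a + w2*b <= 1 for every (a, b)
    in `units`, with w1, w2 >= 0. For a bounded two-variable linear
    programme the optimum lies at a vertex, so all candidate vertices
    are enumerated and the best feasible one is kept.
    """
    # Each line is (a, b, c), meaning a*w1 + b*w2 = c; the two axes are included.
    lines = [(a, b, 1.0) for a, b in units] + [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    best = 0.0
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel lines do not intersect in a single point
        w1 = (c1 * b2 - c2 * b1) / det
        w2 = (a1 * c2 - a2 * c1) / det
        feasible = (
            w1 >= -eps
            and w2 >= -eps
            and all(a * w1 + b * w2 <= 1 + eps for a, b in units)
        )
        if feasible:
            best = max(best, unit[0] * w1 + unit[1] * w2)
    return best


# Made-up units: A and B each excel on one indicator, C is average on both.
data = {"A": (1.0, 0.2), "B": (0.2, 1.0), "C": (0.5, 0.5)}
scores = {name: bod_score(x, list(data.values())) for name, x in data.items()}
print({name: round(s, 3) for name, s in scores.items()})
```

Units A and B both reach the maximum score because each can pick weights that favour its strong indicator, while unit C stays below 1 whatever weights it picks. This shows how the benchmark is built from the best observed performances rather than from theoretical bounds.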

These results should be checked to see whether they agree with common sense. Another way of looking at the results is in a bar chart. Here, since we have a lot of municipalities I will just plot the top thirty. They are coloured by departamento.

We can plot the same chart but broken down by Dimension scores - this can give a view of how much each dimension contributes to the total score.

As a last view of the results (for the moment), we can plot a choropleth map. This is based on the municipal shape files.

Scenario 4: Dendric method

The Dendric method (also known as the Wroclaw Taxonomic Method), originally developed at the University of Wroclaw, is based on the distance from a theoretical unit characterised by the best performance on all the indicators considered.

The final composite indicator is therefore based on the Euclidean distance of each unit from the ideal unit, normalised by a measure of the variability of these distances (mean + 2*std).
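The distance-based construction can be sketched in standalone Python as follows. This is an illustration under assumptions rather than the exact routine used here: the function name `wroclaw_index` is hypothetical, indicators are standardised to z-scores first, higher values are assumed to mean better performance (so the ideal unit takes the maximum of each indicator), and the population standard deviation is used throughout.

```python
import math
from statistics import mean, pstdev


def wroclaw_index(rows):
    """Distance of each unit from the ideal unit, scaled by mean + 2 * std.

    `rows` holds one list of indicator values per unit. A result of 0 means
    the unit coincides with the ideal unit; larger values mean it is further away.
    """
    columns = list(zip(*rows))
    # Standardise each indicator so that distances are comparable across indicators.
    z_columns = [[(x - mean(col)) / pstdev(col) for x in col] for col in columns]
    # The ideal unit has the best (here: highest) value on every indicator.
    ideal = [max(col) for col in z_columns]
    z_rows = zip(*z_columns)
    distances = [
        math.sqrt(sum((z - best) ** 2 for z, best in zip(row, ideal)))
        for row in z_rows
    ]
    scale = mean(distances) + 2 * pstdev(distances)
    return [d / scale for d in distances]


# Made-up data: three units measured on two indicators.
units = [[10.0, 8.0], [5.0, 4.0], [1.0, 1.0]]
print([round(v, 3) for v in wroclaw_index(units)])
```

In this made-up example the first unit is best on both indicators, so it coincides with the ideal unit and scores 0, and the scores increase as units move away from it.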

These results should be checked to see whether they agree with common sense. Another way of looking at the results is in a bar chart. Here, since we have a lot of municipalities I will just plot the top thirty. They are coloured by departamento.

We can plot the same chart but broken down by Dimension scores - this can give a view of how much each dimension contributes to the total score.

As a last view of the results (for the moment), we can plot a choropleth map. This is based on the municipal shape files.

Uncertainty Analysis

Field experts “Reality Check”

A “reality check” of the results with the Country Panel Expert can be facilitated as a last step:

  • Do the result options make sense to field experts?

  • Are there any big gaps in terms of indicators measured?

  • Is there a need to reshuffle indicators or categories?